Module 3a: Image Compression


How can we represent the same amount of data while taking up less memory. Image Compression! Let's go through a simple example.

Let's begin with how our eye works.

"Crudely speaking, the human eyes is a spherical camera with a 20mm focal length...The iris controls the amount of light passing through the lens by controlling the size of the pupil...The retina is unevenly populated with sensor cells. An area near the center of the retina, called the fovea, has a very dense concentration of color receptors, called cones. Away from the center, the density of cones decreases while the density of black and white receptors, the rods, increases." (Computer Vision, Linda Shapiro, 2000)

In peripheral vision our visual system is interpreting color on top of the black and white information . The person sitting next to you might appear in color (you could have color blindness) but your eye is mainly capturing black and white (brightness) information. Since your peripheral vision has a limited amount of color receptors (cones), the color information needs to be shared among a large section of black and white (brightness) receptors (rods). Our brains combine the cone's color information with the neighboring rod's black and white information producing a color image.

Let's try to use our brains technique in an image compression algorithm

0 1 2
0 55,100,200 74,124,100 89,210,10
1 124,74,191 174,43,34 201,142,60
2 191,50,10 215,111,84 245,139,81

How much memory is used to store one color image of size 200 pixels by 200 pixels? Which contains 40,000 individual pixels, each containing three 8 bit Integers.

40,000 pixels x 24 bits (RGB) = 960,000 bits = 120,000 bytes = 0.12 megabytes

Simple approach. Two seperate arrays.

The first contains the brightness values of all the pixels. This can be saved as one value, one 8 Bit Integer, stored in a 2D array ([]). The second contains the RGB pixel values of every other row. This can be saved in 3 values, three 8 bit Integers, stored in a 3D array [[[]]].

1) All the Brightness values. GreyScale 2) Every other row of color vales.

40,000 pixels x 8 bits = 320,000 bits = 40,000 bytes = 0.04 megabytes
20,000 pixels x 24 bits = 480,000 bits = 60,000 bytes = 0.06 megabytes

We saved 0.02 megabytes! A 17% savings! Not bad at all for such a simple approach.

0 1 2
0 226,226,226 226,226,226 226,226,226
1 124 124 124
2 40,40,40 40,40,40 40,40,40

What is the size of the color and black & white array.

Compression (Encoder)

Convert to YCrCb color system.

In the YCbCr color model, the Y represents the luminance or brightness of the image, while the Cb and Cr represent the chrominance or color information. The Y channel is the grayscale or black-and-white part of the image, and the Cb and Cr channels represent the color information.

In the YCbCr color model, the red-difference (Cr) channel and the blue-difference (Cb) channel represent the color information that is separated from the luminance or brightness information.

The Cr channel represents the difference between the red component and the luminance value (Y), which can be calculated as:

Cr = R - Y

Similarly, the Cb channel represents the difference between the blue component and the luminance value (Y), which can be calculated as:

Cb = B - Y

The Cr and Cb channels contain information about the color and hue of the image, which can be used for various image processing and compression techniques. By separating the color information from the luminance information, the YCbCr color model allows for more efficient compression of digital images and videos while maintaining good image quality.

RGB YCrCb JPEG (or YCC)

Y0.299R+0.587G+0.114B

Cr(RY)0.713+delta

Cb(BY)0.564+delta

RY+1.403(Crdelta)

GY0.714(Crdelta)0.344(Cbdelta)

BY+1.773(Cbdelta)

where

delta=128327680.5for 8-bit imagesfor 16-bit imagesfor floating-point images

Y, Cr, and Cb cover the whole value range.

4:4:4, 4:2:2, & 4:2:0 color

Let's steal the Red Difference and the Blue Difference from the row above but keep the Luminance (brightness) from the odd rows.

Decoder

Exercises

(1) We implemented 4:2:2 image compression in class. Can you Implement 4:2:0 image compression?